This research aimed to develop a field sensing system capable of performing three-dimensional
(3D) field mapping for measuring crop height and volume, and of detecting crop rows in 3D for
reliable tractor guidance, using one tractor-mounted stereocamera. The core of the
dual-application field sensing system is a stereovision-based mapping method that creates 3D
crop structure maps by estimating the motion of the tractor-mounted stereocamera and
progressively stitching the constituent stereoimages. In this approach, a feature point
tracking algorithm extracted feature points indicating tractor motion from the constituent
stereoimages, and the outcomes were fed to a mathematical model of tractor dynamics to
estimate the tractor travelling speed and heading direction. Field validation test results
indicated that the developed system could accurately create a 3D crop row structure map that
truthfully represents the spatial variability of crop height and volume from the acquired
stereoimages, and that the map can be used for reliably guiding the tractor by following crop
rows.

©  2008  IAgrE.  Published  by  Elsevier  Ltd.  All  rights  reserved. 


1.  Introduction 

Crop growth condition monitoring and automated vehicle guidance are two major operations in
mechanized precision agriculture. In recent years, commercial GPS-based guidance tractors and
yield mapping systems have been made available by major agricultural machinery manufacturers.
Both applications work towards a similar goal: improving operation efficiency and
productivity. However, such GPS-based technologies share a common limitation in obtaining
local condition awareness, which is often very important for performing efficient automated
field operations.

Machine vision has been widely utilized as a condition awareness sensor because it offers the
ability to assess the target instantly and in a non-destructive manner. Significant research
has already been carried out on machine vision-based guidance systems and field sensing
systems (Reid and Searcy, 1987; Tillett et al., 2001; Søgaard and Olsen, 2003). Field images
acquired from vehicle-mounted cameras provided baseline data for both operations. However,
previous studies have conducted these operations independently (Tillett and Hague, 1999;
Tillett et al., 2002; Benson et al., 2003; Schleicher et al., 2003). The research introduced
herein aimed to explore a machine vision-based field recognition system capable of
agricultural vehicle navigation and three-dimensional (3D) crop mapping using one platform.

Significant research has also been conducted in remote sensing for agriculture (Yang et al.,
2001; Han et al., 2002; Noh et al., 2005). Many different spectral vegetation indices based on
remote sensing data have been developed to estimate biophysical parameters, such as the
height, canopy area, volume, and leaf area index (LAI) (Thenkabail et al., 2000; Bajwa and
Tian, 2001; Goel et al., 2003; Payero et al., 2004). However, these methods could not
determine 3D parameters of vegetation precisely without additional information, such as plant
model, weather history, and soil type (Ji and Peters, 2007). In contrast, a stereovision
system can obtain direct measurement of 3D vegetation structure along with the spectral
information. The additional dimension of the scene provides some critical information for many
agricultural applications, such as crop growth condition observation and physical parameter
estimation (Lines et al., 2001; He et al., 2003; Rovira-Mas et al., 2005), as well as
livestock 3D shape extraction (Wu et al., 2004).

Nomenclature

dx        lateral camera position in vehicle coordinates, m
dy        longitudinal camera position in vehicle coordinates, m
dz        vertical camera position in vehicle coordinates, m
h         heading angle of the tractor, deg
l         translational distance of the tractor, m
m         motion vector of the tractor
N         number of pairs of corresponding points
Q         error function
Rx(θ)     rotation matrix around Xv axis
Ry(θ)     rotation matrix around Yv axis
Rz(θ)     rotation matrix around Zv axis
Xc        X axis of camera coordinates
XcYcZc    camera coordinates
Xv        X axis of vehicle coordinates
XvYvZv    vehicle coordinates
Yc        Y axis of camera coordinates
Yv        Y axis of vehicle coordinates
Zc        Z axis of camera coordinates
Zv        Z axis of vehicle coordinates
x         x coordinate
y         y coordinate
z         z coordinate
α         tilt angle of the camera, deg
β         roll angle of the camera, deg
φ         rotational angle of the tractor, deg
γ         pan angle of the camera, deg
θ         rotational angle, deg

Subscripts

c         camera coordinates
i         image frame number
v         vehicle coordinates
0         image 0
1         image 1

The objective of this research was to explore a dual application of a stereovision-based field
condition awareness system capable of creating 3D crop row structure maps and detecting crop
row locations for navigating a tractor in the field. The crop row detection method for tractor
navigation was developed previously (Kise et al., 2005) and validated in separate field tests
using the sole-function guidance system; this paper therefore focuses on reporting the 3D crop
row structure mapping method, which uses the same hardware system developed for the tractor
navigation application.


2.  Materials  and  methods 

2.1.  Hardware  system 

A stereocamera (STH-MD1, Videre Design, Menlo Park, USA) was installed at the front of the
test platform tractor (John Deere 7700, John Deere, Moline, USA), approximately 2.2 m above
ground level and tilted downward. The tractor platform was modified by installing an
electrohydraulic steering system in parallel with the existing hydraulic steering mechanism to
implement automated steering. The stereocamera consisted of two identical monochrome cameras
whose image sensors were aligned in parallel with each other at a 230 mm separation. Each
monochrome camera had a 1.3 megapixel CMOS imager and a 12 mm lens. With this configuration,
the stereocamera could capture 3D field images between approximately 2 m and 10 m in front of
the moving tractor. An RTK-GPS and a fibre optic gyroscope (FOG) were installed on the tractor
to record the travel path and direction of the vehicle for validation purposes.

2.2.  3D  crop  row  structure  mapping  algorithm 

The 3D crop row structure map was created by progressively stitching consecutive images from
the vehicle-mounted stereocamera. Fig. 1 shows the basic procedure of 3D crop structure map
creation. The first step was stereo image acquisition and processing, which computed the
disparity image to obtain 3D information of the field scene. It was followed by 3D crop row
image creation, in which a disparity image was converted to 3D elevation data of a small patch
in front of the platform tractor with respect to the vehicle coordinates. Feature point
extraction is an image process that extracts distinctive features, such as edges, by applying
a neighbourhood operator to the image. The subsequent feature point tracking compares two
adjacent images and finds corresponding feature points. The camera motion, that is, the motion
of the tractor platform, is then estimated from the result of the feature point tracking.
Constituent images were then projected progressively onto the field image coordinates, based
on the vehicle motion obtained from the preceding process. All algorithms were developed in a
C++ environment (Microsoft Visual Studio .NET 2003, Redmond, USA).

2.3.  3D  crop  row  image  creation 

A 3D crop row image represents the elevation of a small patch in front of the platform
vehicle, and is the baseline data for both the guidance and the 3D crop row structure mapping
applications. In the guidance application, the relative position of the vehicle to the crop
rows was measured by detecting crop rows based on the height difference between crop rows and
the ground in the 3D crop row image. In the 3D crop structure mapping application, 3D crop row
images collected from an entire field were integrated into one map. A 3D crop row image was
created from a single disparity image. Fig. 2 shows an example of a 3D crop row image created
from a stereo image taken in a soya bean field. As shown in Fig. 2(a), the original field
scene image contained soya bean rows with a crop height of approximately 0.40 m and a row
spacing of 0.75 m. The disparity map of the stereo image (Fig. 2(b)) encoded the depth of the
scene and therefore provided the baseline data for the 3D crop row image creation. The grey
level of the 3D crop row image (Fig. 2(c)) indicates the height at the corresponding location
in the field: a brighter pixel represents a higher point. The resolution of the 3D crop row
image was 150 pixels × 150 pixels, which was equivalent to a 3 m × 3 m square area in the
field. With this particular stereo system arrangement, the height resolution of the map was
approximately 5 mm.

Fig. 1 - Flowchart of 3D crop row structure map creation.


A 3D crop row image creation algorithm was formulated by defining the vehicle coordinate
system with its origin at the vehicle centre of gravity (CG), as illustrated in Fig. 3. The
Xv, Yv and Zv axes were defined as the vehicle lateral, longitudinal and vertical directions,
respectively. To compensate for the camera mount angle, a 3D point obtained from a disparity
image with respect to the camera coordinates (xc, yc, zc) was transformed to the vehicle
coordinates (xv, yv, zv) by

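\[
\begin{pmatrix} x_v \\ y_v \\ z_v \end{pmatrix}
= R_z(\gamma)\, R_x(-\alpha)\, R_y(\beta)
\begin{pmatrix} x_c \\ z_c \\ -y_c \end{pmatrix}
+ \begin{pmatrix} d_x \\ d_y \\ d_z \end{pmatrix}
\tag{1}
\]

where α, β, and γ are the tilt, roll, and pan angles of the camera mount, and (dx, dy, dz) is
the camera position offset to the tractor CG. Rx, Ry and Rz are rotation matrices around the
Xv, Yv, and Zv axes and are represented by

\[
R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}, \quad
R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}, \quad
R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

where θ is a rotational angle.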
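As an illustration only, the transformation of Eq. (1) can be sketched in Python (the study itself used C++). The function names `rot_x`, `rot_y`, `rot_z`, `matmul`, `matvec` and `camera_to_vehicle` are hypothetical, and angles are assumed to be given in radians rather than degrees; the axis reordering (xc, zc, -yc) follows Eq. (1).

```python
import math


def rot_x(t):
    """Rotation matrix around the Xv axis (angle in radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(t):
    """Rotation matrix around the Yv axis (angle in radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(t):
    """Rotation matrix around the Zv axis (angle in radians)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    """Product of a 3 x 3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def camera_to_vehicle(p_c, alpha, beta, gamma, offset):
    """Apply Eq. (1): camera point (xc, yc, zc) -> vehicle point (xv, yv, zv).

    alpha, beta, gamma: tilt, roll and pan angles in radians.
    offset: camera position (dx, dy, dz) in vehicle coordinates, m.
    """
    xc, yc, zc = p_c
    r = matmul(rot_z(gamma), matmul(rot_x(-alpha), rot_y(beta)))
    rotated = matvec(r, [xc, zc, -yc])  # axis reordering taken from Eq. (1)
    return [rotated[i] + offset[i] for i in range(3)]
```

With all mount angles set to zero, the camera point (1, 2, 3) simply becomes (1, 3, -2) plus the offset, which is a quick way to check the axis convention.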

The tilt, roll and pan angles were identified from a stereo image: the platform tractor was
parked on a flat surface and a stereo image of the surface was acquired. The flat surface on
which the vehicle was placed represents the XvYv plane; therefore, the 3D data of the flat
surface calibrated with the optimum tilt and roll angles have to minimize the following
equation:

\[
E = \sum \left( z_v - \bar{z}_v \right)^2 \tag{2}
\]

where $\bar{z}_v$ is the average of $z_v$. The value of $z_v$ in Eq. (2) was calculated by
Eq. (1) with dz = 0. With dz = 0, $\bar{z}_v$ essentially corresponds to the camera height dz.

Fig. 2 - Creation of a 3D crop row image: (a) original image (left camera); (b) disparity
image; and (c) resulting 3D crop row image. The grey level in (b) and (c) indicates the crop
height: the brighter in colour, the higher in height.

Fig. 3 - Camera installation and coordinate systems definition: CG, centre of gravity; dx, dy
and dz, longitudinal, horizontal and vertical offsets of the camera location in vehicle
coordinates; α, β and γ, roll, tilt and pan angles of camera installation.

The tilt and roll angles that minimize E in Eq. (2) can be identified by a numerical
optimization algorithm; gradient descent was used in this study (Jongen et al., 2004).
Gradient descent finds a local minimum of a function, so it is important to have a good
initial value. In this application, however, this was not a problem because the camera mount
angle could be measured manually with reasonable accuracy. As a result, the parameters were
identified as α = -0.5°, β = 41.5°, γ = 0.0°, dx = 0.04 m, dy = 2.65 m, and dz = 2.22 m.
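The calibration idea can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: it assumes that E in Eq. (2) is the sum of squared deviations of zv from its mean, uses a finite-difference gradient with a simple backtracking step (the step rule used in the study is not stated), and takes angles in radians. The names `z_vehicle`, `flatness_error` and `calibrate` are hypothetical.

```python
import math


def z_vehicle(p_c, alpha, beta):
    """zv from Eq. (1) with dz = 0.

    Third row of Rx(-alpha) Ry(beta) applied to (xc, zc, -yc);
    the pan rotation Rz does not change zv.
    """
    xc, yc, zc = p_c
    return (-math.cos(alpha) * math.sin(beta) * xc
            - math.sin(alpha) * zc
            - math.cos(alpha) * math.cos(beta) * yc)


def flatness_error(points, alpha, beta):
    """Assumed form of Eq. (2): sum of squared deviations of zv from its mean."""
    z = [z_vehicle(p, alpha, beta) for p in points]
    mean = sum(z) / len(z)
    return sum((v - mean) ** 2 for v in z)


def calibrate(points, alpha, beta, iters=200, eps=1e-6):
    """Gradient descent on E over tilt (alpha) and roll (beta).

    Starts from a manually measured initial guess and only accepts
    steps that reduce E, so E never increases.
    """
    e = flatness_error(points, alpha, beta)
    for _ in range(iters):
        # Central finite-difference approximation of the gradient.
        ga = (flatness_error(points, alpha + eps, beta)
              - flatness_error(points, alpha - eps, beta)) / (2 * eps)
        gb = (flatness_error(points, alpha, beta + eps)
              - flatness_error(points, alpha, beta - eps)) / (2 * eps)
        step = 0.1
        while step > 1e-12:
            na, nb = alpha - step * ga, beta - step * gb
            ne = flatness_error(points, na, nb)
            if ne < e:
                alpha, beta, e = na, nb, ne
                break
            step *= 0.5  # backtrack until the error decreases
        else:
            break  # no improving step found: treat as converged
    return alpha, beta, e
```

Because the search is local, the result depends on the initial guess, which is why a manually measured mount angle is a natural starting point.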

2.4.  Feature  point  tracking 

The camera motion was estimated by tracking the feature points in two adjacent stereoimages.
A feature point is defined as a pixel whose value differs distinctively from those of its
neighbouring pixels. Feature points can be detected by a feature point detector, which
quantifies the distinctiveness of a pixel based on the surrounding pixel values (Harris and
Stephens, 1988; Shi and Tomasi, 1994). By searching for similar features in two adjacent
images, corresponding points between the two images can be determined. Feature point tracking
algorithms are often found in machine vision applications such as face recognition (Moriyama
et al., 2006), motion detection (Poelman and Kanade, 1997) and image mosaicing (Kise and
Zhang, 2008).

Feature points in this study were chosen based on the magnitude of the gradient. The gradient
was calculated by applying a derivative filter to the pixel of interest and its neighbours. A
3 × 3 Sobel mask was used as the derivative filter (Gonzalez and Woods, 1992). Feature points
found in one image were then searched for in the following image. Each feature point found in
the preceding image (image 0) was compared with the feature points in the following image
(image 1) for similarity to determine whether they represent the same point. To compare them,
a feature point and its neighbouring pixels in image 0 were warped to the image 1 coordinates.
If image 0 is a 3D image and the geometric relationship between the two camera positions
(image 0 and image 1) is known, it should be possible to reconstruct a hypothetical camera
that has the view of image 1 from their geometric relationship (Scharstein, 1999).
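The gradient-based feature selection can be illustrated with a short Python sketch. It assumes a grey-level image stored as a list of rows and takes the Euclidean norm of the horizontal and vertical 3 × 3 Sobel responses as the gradient magnitude; the text does not state which norm was used, and the names `sobel_magnitude` and `select_features` are hypothetical.

```python
import math


def sobel_magnitude(img):
    """Gradient magnitude from 3 x 3 Sobel masks; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal derivative: right column minus left column.
            gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
            # Vertical derivative: bottom row minus top row.
            gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
            mag[r][c] = math.hypot(gx, gy)
    return mag


def select_features(img, n):
    """Return the (row, col) positions of the n largest gradient magnitudes."""
    h, w = len(img), len(img[0])
    mag = sobel_magnitude(img)
    scored = [(mag[r][c], r, c) for r in range(1, h - 1) for c in range(1, w - 1)]
    scored.sort(reverse=True)
    return [(r, c) for _, r, c in scored[:n]]
```

For a field image, `select_features(img, 400)` would correspond to keeping the 400 strongest gradients as candidate feature points.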

If the two feature points represent the same point in the field, the warped view from image 0
is in theory identical to the view of image 1. The warped image was compared with an
equal-size window centred at the feature point in image 1. The sum of squared differences
(SSD) of grey levels across the windows was used as an indicator of the similarity of the two
windows.

\[
S(x_0, y_0, d_x, d_y) = \sum_{i=x_0-w}^{x_0+w} \; \sum_{j=y_0-w}^{y_0+w}
\left\{ I_0(i, j) - I_1(i + d_x,\, j + d_y) \right\}^2 \tag{3}
\]

where S(x0, y0, dx, dy) is the SSD between the window centred at (x0, y0) in image 0 and the
window centred at (x0 + dx, y0 + dy) in image 1, 2w + 1 is the window size, and I0(i, j) and
I1(i, j) are the grey levels of pixel (i, j) in image 0 and image 1, respectively. In this
case, (x0, y0) in image 0 and (x0 + dx, y0 + dy) in image 1 were the feature points to be
compared for matching. The smallest S occurs at the best match of the points.
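A minimal Python sketch of the SSD comparison in Eq. (3) is given below. It omits the warping step described above and compares raw windows directly, which is a simplification; images are assumed to be lists of rows indexed as `img[y][x]`, the caller is assumed to keep both windows inside the image, and the names `ssd` and `best_match` are hypothetical.

```python
def ssd(img0, img1, x0, y0, dx, dy, w):
    """Eq. (3): SSD between the (2w+1) x (2w+1) window centred at (x0, y0)
    in image 0 and the window centred at (x0 + dx, y0 + dy) in image 1."""
    total = 0.0
    for i in range(x0 - w, x0 + w + 1):
        for j in range(y0 - w, y0 + w + 1):
            diff = img0[j][i] - img1[j + dy][i + dx]
            total += diff * diff
    return total


def best_match(img0, img1, x0, y0, candidates, w):
    """Pick the candidate feature point (x, y) in image 1 with the smallest SSD."""
    return min(candidates,
               key=lambda p: ssd(img0, img1, x0, y0, p[0] - x0, p[1] - y0, w))
```

In practice a threshold on the smallest S would probably also be needed to reject feature points that have no true counterpart in image 1; the text does not describe such a threshold.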

Fig. 4 shows an example of feature point tracking. A gradient was calculated at all pixels in
the image, and the pixels with the 400 largest gradients were selected as the feature points
of the image (Fig. 4(a)). As a result of the feature point tracking, 89 corresponding points
were found in the subsequent image (Fig. 4(b)).

2.5.  Tractor  motion  estimation 

To support the estimation process based on the corresponding feature points between two
consecutive images, it is necessary to define how the camera motion is represented. Fig. 5
presents the definition of a motion vector in terms of the camera motion information extracted
from two adjacent images, based on a simple tractor dynamics model. This motion vector
represents both the translational and the rotational motion of the vehicle with reference to
the vehicle coordinates at image 0, as follows:

\[
\mathbf{m} = [\, l, \ \phi \,]^T \tag{4}
\]

where m is the motion vector, l is the translational distance and φ is the rotational angle
of the tractor.

Fig. 4 - Feature point tracking: (a) 400 largest gradients of all pixels are selected as
feature points and (b) as the result of the feature point tracking, 89 corresponding points
are found in the adjacent image.

The tractor dynamics model shown in Fig. 5 was designed under the assumption that the tractor
travels on a flat surface, so that the vertical translation and the rotations about the Xv and
Yv axes (pitching and rolling) of the tractor could be ignored. With corresponding points
identified by the feature point tracking, the geometric relationship of a corresponding point
between two images can be expressed as follows:

\[
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
= \begin{pmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{pmatrix}
\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}
+ l \begin{pmatrix} \sin\phi \\ \cos\phi \end{pmatrix} \tag{5}
\]

where $[x_0, y_0]^T$ and $[x_1, y_1]^T$ are the locations of the corresponding points in
image 0 ($X_v^0 Y_v^0$ coordinates) and image 1 ($X_v^1 Y_v^1$ coordinates), respectively.
Considering that φ and l are small and ignoring second-order terms, Eq. (5) becomes

\[
\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
= \begin{pmatrix} x_1 + \phi\, y_1 \\ -\phi\, x_1 + y_1 + l \end{pmatrix} \tag{6}
\]


The motion vector can be calculated by a least squares method based on multiple sets of
corresponding points:

\[
Q = \sum_{k=0}^{N-1} \left\{ \left( x_{0,k} - x_{1,k} - \phi\, y_{1,k} \right)^2
+ \left( y_{0,k} + \phi\, x_{1,k} - y_{1,k} - l \right)^2 \right\} \tag{7}
\]

where Q is the error function to be minimized; $[x_{0,k}, y_{0,k}]^T$ and
$[x_{1,k}, y_{1,k}]^T$ are a pair of corresponding points found by the feature point
tracking; and N is the number of pairs of corresponding points found. The $[l, \phi]^T$ that
minimizes Q is expressed by

\[
\phi = \frac{ N \sum_{k=0}^{N-1} \left\{ x_{1,k}\, y_{0,k} - y_{1,k}\, x_{0,k} \right\}
+ \sum_{k=0}^{N-1} x_{1,k} \sum_{k=0}^{N-1} \left( y_{1,k} - y_{0,k} \right) }
{ \left( \sum_{k=0}^{N-1} x_{1,k} \right)^2
- N \sum_{k=0}^{N-1} \left\{ (x_{1,k})^2 + (y_{1,k})^2 \right\} } \tag{8}
\]

\[
l = \frac{ \sum_{k=0}^{N-1} \left\{ (x_{1,k})^2 + (y_{1,k})^2 \right\}
\sum_{k=0}^{N-1} \left( y_{1,k} - y_{0,k} \right)
+ \sum_{k=0}^{N-1} x_{1,k} \sum_{k=0}^{N-1} \left\{ x_{1,k}\, y_{0,k} - y_{1,k}\, x_{0,k} \right\} }
{ \left( \sum_{k=0}^{N-1} x_{1,k} \right)^2
- N \sum_{k=0}^{N-1} \left\{ (x_{1,k})^2 + (y_{1,k})^2 \right\} } \tag{9}
\]

Fig. 5 - Tractor dynamics model: $X_v^0 Y_v^0$, vehicle coordinates in image 0;
$X_v^1 Y_v^1$, vehicle coordinates in image 1; φ, rotational angle of the tractor; l,
translational distance of the tractor; and m, motion vector.

From Eqs. (5), (8) and (9), the travel path of the vehicle, namely the trajectory of the
origin of the vehicle coordinate system (XvYvZv) with respect to field coordinates, could be
represented in recursive form based on dead reckoning (Mizushima et al., 2002):

\[
\begin{aligned}
\begin{pmatrix} o_{x,t+1} \\ o_{y,t+1} \end{pmatrix}
&= \begin{pmatrix} \cos\phi_t & -\sin\phi_t \\ \sin\phi_t & \cos\phi_t \end{pmatrix}
\begin{pmatrix} o_{x,t} \\ o_{y,t} \end{pmatrix}
+ \begin{pmatrix} 0 \\ l_t \end{pmatrix} \\
o_{z,t+1} &= o_{z,t} \\
h_{t+1} &= h_t + \phi_t
\end{aligned} \tag{10}
\]

with

\[
o_{x,0} = 0, \quad o_{y,0} = 0, \quad o_{z,0} = 0, \quad h_0 = 0
\]

where t is the image frame number, $(o_x, o_y, o_z)$ is the origin of the vehicle coordinate
system with respect to map coordinates, and h is the angle between the map coordinate system
and the vehicle coordinate system. Each crop row structure image could be merged in a stepwise
fashion to form an entire 3D field map. If there was an overlapping area between two images,
the average of overlapping pixels was used.
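The closed-form least squares solution of Eqs. (8) and (9) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' C++ code; the function name `estimate_motion` and the representation of corresponding points as two equally long lists of (x, y) tuples are assumptions. The rotational angle is returned in radians.

```python
def estimate_motion(p0, p1):
    """Estimate (l, phi) from corresponding points using Eqs. (8) and (9).

    p0: points [x0, y0] in image 0; p1: matching points [x1, y1] in image 1.
    Valid under the small-motion linearization of Eq. (6).
    """
    if len(p0) != len(p1) or not p0:
        raise ValueError("p0 and p1 must be non-empty and of equal length")
    n = len(p0)
    sum_x1 = sum(x1 for x1, _ in p1)
    sum_r2 = sum(x1 * x1 + y1 * y1 for x1, y1 in p1)
    sum_dy = sum(y1 - y0 for (_, y0), (_, y1) in zip(p0, p1))
    sum_cross = sum(x1 * y0 - y1 * x0 for (x0, y0), (x1, y1) in zip(p0, p1))
    den = sum_x1 * sum_x1 - n * sum_r2  # common denominator of Eqs. (8) and (9)
    if den == 0:
        raise ValueError("degenerate point configuration")
    phi = (n * sum_cross + sum_x1 * sum_dy) / den   # Eq. (8)
    l = (sum_r2 * sum_dy + sum_x1 * sum_cross) / den  # Eq. (9)
    return l, phi
```

Given that Eq. (9) was used for speed estimation and images were recorded at a nominal frame rate, the travelling speed presumably follows from dividing l by the frame interval; the exact conversion is not spelled out in the text.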


3.  Results  and  discussions 

3.1.  Test  field 

A series of stereoimages was acquired in a soya bean field to validate the developed
algorithm. The test field was located at an experimental farm at the University of Illinois at
Urbana-Champaign. It was divided into three plots according to the day of planting. Each plot
had 40 rows of soya beans, 80 m long, with 0.75 m inter-row spacing. On the day the images
were collected, the crops in the three plots were at 44, 52, and 65 days after planting (DAP),
and the approximate crop heights were consequently 0.55 m, 0.70 m, and 0.80 m, respectively.
The tractor travelled 10 paths in each plot to obtain 3D field images, resulting in a total of
30 paths for the entire field. The stereocamera recorded images at 5 frames s⁻¹ with the
tractor travelling at 1.0 m s⁻¹. The tractor was equipped with an RTK-GPS to record the paths
for reference. The height of plants was measured manually at 90 locations as ground truth
data. Thirty single plants were selected randomly in each plot, and the height of each plant
from the ground to the top of the main plant stem was measured with a ruler. The locations of
the measured plants were also recorded by RTK-GPS.

Fig. 6 - 3D crop row structure map created by stereovision-based mapping system; 90 m × 80 m
entire 3D map consisting of three plots of 44, 52, and 65 DAPs. The grey level indicates the
crop height: the brighter in colour, the higher in height.

3.2.  3D  crop  row  structure  map 

Fig. 6 shows a 3D soya bean crop row structure map created from the stereoimages. The grey
level of the map indicates the crop height at the corresponding location in the field, with a
brighter pixel representing a higher point. The spatial resolution of the map is
1.5 cm pixel⁻¹. The vertical direction (upward) corresponds to the northing.

The  map  of  each  single  path  was  created  individually,  and 
then  combined  side-by-side  to  form  the  entire  field  map.  The 
total  30  constituent  paths  comprise  the  entire  map.  Each  path 
was  wide  enough  to  contain  four  rows.  The  images  were 
always  recorded  from  the  north  side  of  the  field  in  order  to 
align  each  single  path  map. 

As expected, the height differences among the three plots (44, 52, and 65 DAP) were apparent
in the map: plot 1 (44 DAP) had the lowest crop height, while plot 3 (65 DAP) had the highest
in the field. The map also shows the intra-plot variability. Fig. 7(b) is an enlarged view of
the portion of the map corresponding to a 2.7 m × 4.8 m rectangular region at the right edge
of plot 3. As shown in the original stereo image (Fig. 7(a)), one row was distinctively
smaller than the rest of the rows as a result of being under-fertilized. The enlarged map
demonstrated this size difference; the crops in the second row from the left in the map were
darker (which means lower height) and thinner than those in the other three rows.

The enlarged map also shows that not only the height of a single plant but also its volume
could be measured. For example, the total canopy volume of Fig. 7(b) could be calculated by
simply summing all pixel values in the figure.
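As a worked illustration of this volume calculation, the following Python sketch sums the pixel values of a height map and scales the sum by the ground area of one pixel. It assumes that the pixel values have already been converted to heights in metres and that each pixel covers a square of side `pixel_size_m` (1.5 cm for the map described here); the function name `canopy_volume` is hypothetical.

```python
def canopy_volume(height_map, pixel_size_m):
    """Approximate canopy volume (m^3) as the sum of pixel heights (m)
    multiplied by the ground area covered by one pixel (m^2)."""
    pixel_area = pixel_size_m ** 2
    return sum(sum(row) for row in height_map) * pixel_area
```

For instance, four pixels with heights 0.5, 0.5, 0.0 and 1.0 m at 1.5 cm resolution give 2.0 m × 0.000225 m² = 0.00045 m³.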

Fig. 8 compares the crop heights at 90 points estimated from the 3D field map against the
manually measured ground truth data. The validation showed that the root mean squared (RMS)
error between the 3D map values and the ground truth data was 0.04 m, with a maximum error of
0.09 m. This validation result showed that the 3D field mapping system developed in this
research could provide centimetre-level crop plant height information with a high spatial
resolution in the form of a panoramic field view.

3.3.  Tractor  motion  estimation 

The accuracy of the travel speed and heading angle obtained by tractor motion estimation
determines the mapping accuracy in terms of plant location (XY plane). Fig. 9 shows the
estimated travelling speed of the tractor using Eq. (9) in comparison to the RTK-GPS data.
The results indicated that the maximum error of speed estimation was within 0.30 m s⁻¹ for
tractor travelling speeds between 0.70 m s⁻¹ and 1.10 m s⁻¹. Fig. 9 also revealed that the
estimation errors were not uniformly distributed but tended to have a relatively constant
offset for each trip. This offset, observed in almost all test runs, was presumably caused by
the data acquisition program not acquiring the stereoimages at a constant rate: the image
acquisition rate might have changed periodically during the data acquisition, resulting in
the constant-offset error of the travel speed estimation. The data indicated that 90% of the
errors were distributed within a range of ±0.20 m s⁻¹, and the RMS error was 0.11 m s⁻¹.

Fig. 10 shows the comparison of the estimated yaw angle to the FOG sensor data. In contrast
to the travel speed estimation, the estimated yaw angle did not show any noticeable
constant-offset error. Such a difference could be attributed to the fact that the yaw angle
estimation was a result of the accumulation of angular changes between consecutive
stereoimages, which made the resulting angle less sensitive to the irregular timing in image
acquisition. 90% of the errors were distributed within a range of ±0.67°, with an RMS error
of 0.34°.

Very similar results were obtained from all three plots: the average RMS speed and yaw angle
estimation errors were 0.11 m s⁻¹ and 0.38° for plot 1, 0.13 m s⁻¹ and 0.44° for plot 2, and
0.12 m s⁻¹ and 0.37° for plot 3, respectively. The results showed that the motion recovery
algorithm could reliably estimate both the travel speed and the yaw angle of a tractor with
sufficient accuracy for guidance applications.

Fig. 7 - Enlarged view of the 3D crop row structure map: (a) the original field scene taken
by the stereocamera and (b) enlarged portion (2.7 m × 4.8 m) of the 3D crop row structure map
corresponding to the white rectangle area in Fig. 7(a). The grey level in (b) indicates the
crop height: the brighter in colour, the higher in height.

Fig. 8 - Comparison of the crop height from the 3D crop row structure map against the ground
truth data (axis label: Estimated crop height [m]).

Fig. 9 - Performance of travel speed estimation using the motion recovery algorithm.

Fig. 10 - Performance of yaw angle estimation using the motion recovery algorithm (axis
label: Frame number).


4.  Conclusion 

The development of a 3D field mapping system for panoramic 3D crop row structure map creation
was reported. Supported by the developed algorithm, a tractor-mounted stereocamera could
acquire a stream of field stereoimages and create a 3D crop row structure map of an entire
field from the acquired images. This map of crop row structure could be used not only to
measure crop plant height and volume at a high spatial resolution (1.5 cm in the horizontal
plane and 5.0 mm along the vertical axis) and a high accuracy (4.0 cm in plant height
estimation), but also to provide vehicle guidance information with sufficient accuracy (less
than 0.13 m s⁻¹ and 0.44° for travelling speed and yaw angle estimation). The results showed
that the developed algorithm could be a very useful tool for many applications, e.g. plant
growth condition monitoring by comparing two maps created on different dates, and plant
volume calculation for biomass productivity estimation. A study addressing the use of GPS to
compensate for drift error could further improve the accuracy of the system.